Diffix: High-Utility Database Anonymization

Authors

  • Paul Francis
  • Sebastian Probst Eide
  • Reinhard Munz

Abstract

In spite of the tremendous privacy and liability benefits of anonymization, most shared data today is only pseudonymized. The reason is simple: there haven’t been any anonymization technologies that are general purpose, easy to use, and preserve data quality. This paper presents the design of Diffix, a new approach to database anonymization that promises to break new ground in the utility/privacy trade-off. Diffix acts as an SQL proxy between the analyst and an unmodified live database. Diffix adds a minimal amount of noise to answers—Gaussian with a standard deviation of only two for counting queries—and places no limit on the number of queries an analyst may make. Diffix works with any type of data and configuration is simple and data-independent: the administrator does not need to consider the identifiability or sensitivity of the data itself. This paper presents a high-level but complete description of Diffix. It motivates the design through examples of attacks and defenses, and provides some evidence for how Diffix can provide strong anonymity with such low noise levels.
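
As a rough illustration of the noise level the abstract mentions, the sketch below adds zero-mean Gaussian noise with a standard deviation of two to an exact count. It is not Diffix's actual mechanism, which the abstract does not detail. The function name `noisy_count`, the fixed seed, and the rounding and clamping to a non-negative integer are assumptions made purely for illustration.

```python
import random

NOISE_SD = 2.0  # standard deviation quoted in the abstract for counting queries


def noisy_count(true_count, rng):
    """Return true_count plus zero-mean Gaussian noise (SD = NOISE_SD).

    Rounding and clamping at zero are illustrative assumptions,
    not something the abstract specifies.
    """
    noise = rng.gauss(0.0, NOISE_SD)
    return max(0, round(true_count + noise))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    samples = [noisy_count(1000, rng) for _ in range(5)]
    print(samples)
```

With a standard deviation of two, about 95% of the noisy values (before rounding) fall within roughly four of the true count, which gives a sense of how small the perturbation is for counts in the hundreds or thousands.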

Similar resources

An Effective Method for Utility Preserving Social Network Graph Anonymization Based on Mathematical Modeling

In recent years, privacy concerns about social network graph data publishing have increased due to the widespread use of such data for research purposes. This paper addresses the problem of identity disclosure risk of a node assuming that the adversary identifies one of its immediate neighbors in the published data. The related anonymity level of a graph is formulated and a mathematical model is...

Data attribute security and privacy in Collaborative distributed database Publishing

In this era, there is a need to secure data in distributed database systems. For collaborative data publishing, some anonymization techniques are available, such as generalization and bucketization. We consider the so-called “insider attack” by colluding data providers, who may use their own records to infer others' records. To protect our database from these types of attacks we used slicing...

The Boundary Between Privacy and Utility in Data Publishing

We consider the privacy problem in data publishing: given a database instance containing sensitive information “anonymize” it to obtain a view such that, on one hand attackers cannot learn any sensitive information from the view, and on the other hand legitimate users can use it to compute useful statistics. These are conflicting goals. In this paper we prove an almost crisp separation of the c...

Privacy vs. Utility in Anonymized Data

We investigate the privacy and utility aspects of k-anonymity, which has received much research attention since its introduction in [Sweeney, 2002]. Meyerson and Williams [2004] showed that finding an optimal k-anonymization is NP-hard and developed a first approximation algorithm. Further algorithms with different approximation guarantees have been proposed, but it remains hard to compare thes...

Quantification of De-anonymization Risks in Social Networks

The risks of publishing privacy-sensitive data have received considerable attention recently. Several deanonymization attacks have been proposed to re-identify individuals even if data anonymization techniques were applied. However, there is no theoretical quantification for relating the data utility that is preserved by the anonymization techniques and the data vulnerability against de-anonymi...

Publication date: 2017